Offline Monte Carlo Tree Search for Statistical Model Checking of Markov Decision Processes
نویسندگان
چکیده
To find the optimal policy for large Markov Decision Processes (MDPs), where state space explosion makes analytic methods infeasible, we turn to statistical methods. In this work, we apply Monte Carlo Tree Search to learning the optimal policy for a MDP with respect to a Probabilistic Bounded Linear Temporal Logic property. After we have the policy, we can proceed with statistical model checking or probability estimation for the property against the MDP. We have implemented this method as an extension to the PRISM probabilistic model checker, and discuss its results for several case studies.
منابع مشابه
Solving Hidden-Semi-Markov-Mode Markov Decision Problems
Hidden-Mode Markov Decision Processes (HM-MDPs) were proposed to represent sequential decision-making problems in non-stationary environments that evolve according to a Markov chain. We introduce in this paper Hidden-Semi-Markov-Mode Markov Decision Processes (HS3MDPs), a generalization of HM-MDPs to the more realistic case of non-stationary environments evolving according to a semi-Markov chai...
متن کاملLearning in POMDPs with Monte Carlo Tree Search
The POMDP is a powerful framework for reasoning under outcome and information uncertainty, but constructing an accurate POMDP model is difficult. Bayes-Adaptive Partially Observable Markov Decision Processes (BA-POMDPs) extend POMDPs to allow the model to be learned during execution. BA-POMDPs are a Bayesian RL approach that, in principle, allows for an optimal trade-off between exploitation an...
متن کاملApproaching Bayes-optimalilty using Monte-Carlo tree search
Bayes-optimal behavior, while well-defined, is often difficult to achieve. Recent advances in the use of Monte-Carlo tree search (MCTS) have shown that it is possible to act nearoptimally in Markov Decision Processes (MDPs) with very large or infinite state spaces. Bayes-optimal behavior in an unknownMDP is equivalent to optimal behavior in the known belief-space MDP, although the size of this ...
متن کاملEfficient Resource Allocation for Sparse Multiple Object Tracking
In this work we address the multiple person tracking problem with resource constraints, which plays a fundamental role in the deployment of efficient mobile robots for real-time applications involved in Human Robot Interaction. We pose the multiple target tracking as a selective attention problem in which the perceptual agent tries to optimize the overall expected tracking accuracy. More specif...
متن کاملLifting Techniques for Sequential Decision Making and Probabilistic Inference (Extended Abstract)
Many traditional AI algorithms fail to scale as the size of state space increases exponentially with the number of features. One way to reduce computation in such scenarios is to reduce the problem size by grouping symmetric states together and then running the algorithm on the reduced problem. The focus of this work is to exploit symmetry in problems of sequential decision making and probabili...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2015